Principled Induction of Phrasal Bilexica

Authors

  • Markus Saers
  • Dekai Wu
Abstract

We aim to replace the long and complicated pipeline employed to produce probabilistic phrasal bilexica with a theoretically principled, grammar-based approach. To this end, we introduce a learning regime for inducing a phrasal grammar equivalent to linear transduction grammars. The stochastic version of this new grammar type also has the property that the set of biterminals constitutes a natural probability distribution, making it similar to a probabilistic translation lexicon. Since we learn a phrasal grammar, we are, in effect, learning a probabilistic phrasal bilexicon. As a proof of concept, we show that phrasal bilexica induced in this manner can be used to improve the performance of a traditional phrase-based SMT system.

Similar resources

Speech Translation with Grammar Driven Probabilistic Phrasal Bilexica Extraction

We introduce a new type of transduction grammar that allows for learning of probabilistic phrasal bilexica, leading to a significant improvement in spoken language translation accuracy. The current state-of-the-art in statistical machine translation relies on a complicated and crude pipeline to learn probabilistic phrasal bilexica—the very core of any speech translation system. In this paper, w...

The Effect of Conceptual Metaphor Awareness on Learning Phrasal Verbs by Iranian Intermediate EFL Learners

The ability to comprehend and produce phrasal verbs, as lexical chunks or groups of words which are commonly found together, is an important part of language learning. This study investigates the effect of ‘conceptual metaphor awareness’, as a newly developed technique in Cognitive Linguistics, on learning phrasal verbs by Iranian intermediate EFL learners. To meet this objective, two intact ho...

From Finite-State to Inversion Transductions: Toward Unsupervised Bilingual Grammar Induction

We report a wide range of comparative experiments establishing for the first time contrastive foundations for a completely unsupervised approach to bilingual grammar induction that is cognitively oriented toward early category formation and phrasal chunking in the bootstrapping process up the expressiveness hierarchy from finite-state to linear to inversion transduction grammars. We show a cons...

A principled Cognitive Linguistics account of English phrasal verbs with up and out

Many attempts have been made to discover some systematicity in the semantics of phrasal verbs. However, most research has investigated the semantics of particles exclusively; no study has examined how the multiple meanings of the verb also contribute to the meanings of phrasal verbs. The current corpus-based (COCA) study advances the research on phrasal verbs by examining the interaction of the...

Approach to Automatic Translation Template Acquisition Based on Unannotated Bilingual Grammar Induction

In this paper, we propose a new approach which can automatically acquire translation templates from the unannotated bilingual spoken language corpora in the domain of travel information accessing. In the approach, two basic algorithms named grammar induction algorithm and dynamic programming algorithm are adopted. Our approach is an unsupervised, statistical, data-driven method which avoids the...

Publication date: 2011